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ABSTRACT 

In the effort to characterize the masses, radii, and atmospheres of potentially habitable exoplanets, 
there is an urgent need to find examples of such planets transiting nearby M dwarfs. The MEarth 
Project is an ongoing effort to do so, as a ground-based photometric survey designed to detect ex- 
oplanets as small as 2R® transiting mid-to-late M dwarfs within 33 pc of the Sun. Unfortunately, 
identifying transits of such planets in photometric monitoring is complicated both by the intrinsic 
stellar variability that is common among these stars and by the nocturnal cadence, atmospheric varia- 
tions, and instrumental systematics that often plague Earth-bound observatories. Here we summarize 
the properties of MEarth data gathered so far, emphasizing the challenges they present for transit 
detection. We address these challenges with a new framework to detect shallow exoplanet transits in 
wiggly and irregularly-spaced light curves. In contrast to previous methods that clean trends from 
light curves before searching for transits, this framework assesses the significance of individual tran- 
sits simultaneously while modeling variability, systematics, and the photometric quality of individual 
nights. Our Method for Including Starspots and Systematics in the Marginalized Probability of a 
Lone Eclipse (MISS MarPLE) uses a computationally efficient semi-Bayesian approach to explore the 
vast probability space spanned by the many parameters of this model, naturally incorporating the 
uncertainties in these parameters into its evaluation of candidate events. We show how to combine in- 
dividual transits processed by MISS MarPLE into periodic transiting planet candidates and compare 
our results to the popular Box-fitting Least Squares (BLS) method with simulations. By applying 
MISS MarPLE to observations from the MEarth Project, we demonstrate the utility of this framework 
for robustly assessing the false alarm probability of transit signals in real data. 
Subject headings: stars: low-mass — planetary systems — methods: data analysis — eclipses - 
techniques: photometric 



1. INTRODUCTION 

Observationally, nearby M dwarf stars offer both op- 
portunities and challenges as exoplanet hosts. M dwarfs' 
low masses and small sizes accentuate the radial veloc- 
ity wobble and eclipse depths of any planets that transit 
them. Their low luminosities result in habitable zones at 
much smaller orbital distances than for more luminous 
stars, so planets in M dwarf habitable zones are more 
likely to transit and will transit mor e frequently. These 
advantages aid the initial discove ry dNutzman & Char 



bonneau 2008 Blake et al. 2008) and the later detailed 



characterization (e.g. Deming et al.|2009| of planets that 
could be small enough and cool enough to potentially 
host life. Mid-to-late M dwarfs offer a particularly com- 
pelling balance in that they have smaller statures than 
earlier-type stars but are still sufficiently bright to enable 
high precision followup studies, unlike later-type objects. 

Exploiting this opportunity, the ground-based MEarth 
Project is using robotic, 40 cm telescopes to photomet- 
rically monitor nearby (< 33 pc), mid-to-late M dwarfs. 
MEarth has been operating since 2008 with eight tele- 
scopes on Mt. Hopkins, AZ, and will soon include 8 
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additional telescopes in the Southern hemisphere. By 
design, MEarth intends to be sensitive to planets as 
small as 2R^ and with periods as long as 20 days, reach- 
ing the habitable z ones of these stars (see Nutzman &; 
Charbonneau! ^!008 ) . Like MEarth, several additional 



ground- based surveys are attempting to capita lize on the 
M dw arf advantage, including PTF/M-dwarfs (Law et al. 



20li| , TR APPIST (|Jehin et al.|2lHlj|Bonfils et al.|2012 ) 
APACHE dGiacobb c et al. 20T5f , and the WFCAM Tran- 
sit Survey (Ncf s et al.||2012p . 

MEarth's first discovered transitin g planet, the 1.6 day, 
2.7 R m , 6.6 M ffi exoplanet GJ12I4b (Charbonneau et al. 
20091, is far too hot for habitability! But as the first 
planet in this size range accessible to atmospheric char- 
acterization, GJ121 4b has proven a useful laboratory for 
theoretical work (e.g Miller- Ricci & Fortncy 2010; Rogers 
fc Seager||20T0l |Nettelmann et ai.||Ml| |Menou||^()l^ 



Millcr-Ric ci Kempton et al~] 20121) an d for observationa 
studies, both from the ground (e.g. Bean et al.| 2010 
20TT] [Carter et al.|[2^TT] |Berta et al.pOlll |Kundurthy 

ilc 



et al.||2lJTT | 

Mooij eta 



Berta et al 



Croll et al H2 01 1| ICrossfield et al.||201lTlcTe 
'2012' and from space ( |besert et al. 2011' 



2012 ). Yet, its period is still very short 



What are the prospects for finding planets with longer 
periods, potentially habitable planets? 

In light of the relative ease with which the space-based 
Kepler Mission can fi nd transiting planets with periods 



longer than 100 days (Batalha et al. 2012), it is impor 



tant to emphasize that 10-20 day habitable zone periods 
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are long enough to pose significant detection challenges 
from the ground. Planets with these periods, even if 
geometrically aligned to transit, may offer only a single 
transit per season that can be observed from a single site, 



between weather losses and daytim e gaps (e.g., Pepper & 
|Gaudi|2005t |von Braun et al.|2009[ ) . The scarcity of tran- 
sits poses a two-fold problem: multiple transits are often 
necessary to build up sufficie nt signal-to-noise for detec- 
tion (e.g. Bakos et al. 2010), and multiple transits are, 
at some point, almost always necessary for determining 
a planet's period. 

MEarth attempts to address the first challenge with a 
novel, automated "real-time trigger" mode of operation. 
This aids our ability to establish sufficient signal-to-noise 
to detect planet candidates from one or very few tran- 
sits. While observing a target at low-cadence, MEarth 
can rapidly identify in-progress, marginally significant, 
single transit events from incoming observations. If an 
in-progress event crosses a low (3ct) threshold, MEarth 
can automatically trigger high-cadence followup t o con 



firm the candidate eyent at higher confidence (see Nutz- 
man fc Charbonneau|2008||rrwin et al.|2009b[ ). If a tran- 



sit is real, the triggered observations could magnify its 
significance from mediocre to ironclad, without having 
to wait to observe subsequent transits. If no transit is 
present, the triggered observations generally wash out 
the importance of the original downward outliers. 

The ability to confirm single events at high significance 
is crucial to MEarth's goal of finding long period planets. 
Our recent discovery of LSPM J1112+7626, a bright 0.4 
+ 0.3 M Q d ouble-lined eclipsin g binary in a 41 day or- 
bital period ( |Irwin et al.|2011b ) highlights this point. We 
identified LSPM J 1112+7626 from three exposures taken 
during a single primary eclipse. Due to the deep (> 10%) 
eclipses, we were confident the system was real. In par- 
allel with continued photometric monitoring, we began 
radial velocity observations, and the combination of these 
two efforts ultimately established the binary's 41 day pe- 
riod. We envision the discovery of a long period planet 
to follow the same trajectory: a shallow (1%) transit 
could be identified confidently using high-cadence obser- 
vations from the real-time trigger, and follow-up scrutiny 
could be invested to measure the planet's period. This 
two-part strategy is the only way a ground-based survey 
like MEarth will have sufficient sensitivity to find planets 
with periods longer than 10 days. 

For this strategy to work, we need a robust method for 
accurately assessing the significance of individual transit 
events, both initially to trigger high-cadence observations 
of marginal events and later to assess whether an event 
is significant enough to warrant period-finding follow-up. 
This problem would be straightforward if MEarth's tran- 
sit light curves exhibited no noise other than perfectly- 
behaved, uncorrelated, Gaussian, photon noise. This is 
not the case. MEarth light curves show astrophysical 
noise from the M dwarfs themselves, in the form of ro- 
tational modulation due to starspots or sporadic stellar 
flares. They show instrumental noise, such as that caused 
by pointing drifts, focus changes, and flat-fielding errors. 
They show extinction effects from Earth's dynamic at- 
mosphere, some of which, as we discuss here, pose par- 
ticularly pernicious problems for photometry of red stars. 
Often, these noise sources can mimic both the amplitude 
and the morphology of single planetary transits. 



To invest MEarth's follow-up efforts wisely, we need 
a conservative method for assessing the significance of 
a transiting planetary signal in the face of these com- 
plicated noise sources. This method needs to both (a) 
suppress, remove, or correct for stellar variability and 
systematics to increase sensitivity to shallower transits 
and (b) accurately propagate the uncertainties associated 
with this cleaning process into the significance assigned 
to the candidate signal. This method needs to be able 
to do so, even if only one or few transits are observed. It 
does not need to accurately determine the period of a sig- 
nal; we postpone that endeavor for the eventual follow-up 
of statistically promising candidates. 

The exoplanet literature is teeming with well- 
established methods for cleaning variability and system- 
atics from transit survey light curves and for searching 
those cleaned light curves for periodic transit signals. 
However, to our knowledge, of those methods appropri- 
ate for ground-based observations none is sufficiently well 
suited to this challenge of estimating the significance of 
individual transits events. In this paper, we propose a 
new method, one that searches for transits simultane- 
ously with a light curve cleaning process, so that the 
significance of candidates is marginalized over the clean- 
ing's uncertainties. 

If the rate of planet occurrence around mid-to-late M 
dwarfs rises sharply toward smaller planet sizes and lon g 
periods, as it does for FGK stars (iHoward et al. 2012), 



then the development of even small improvements to our 
ability to detect shallow, rare transits could have a big 
payoff for MEarth. Additionally, the development of this 
method also provides a framework with which to estimate 
the ensemble sensitivity of the survey as a whole, thus 
enabling a statistical study from MEarth on the popu- 
lation of planets orbiting nearby mid-to-late M dwarfs. 
We intend to describe the results of such a study in a 
forthcoming work. 

We begin by introducing the MEarth survey with a 
description of the observations we have gathered so far 
(Section |2]). After reviewing the light curve cleaning and 
transit detection techniques that have been described in 
the literature to date (Section [3]), we outline our new 
framework for MEarth, describing both how to estimate 
the significance of a single transit event and how to in- 
corporate well-characterized single events into periodic 
planet candidates (Section |4|. We test this method with 
simulations of injected transits and demonstrate that the 
candidates generated by its application to the existing 
MEarth dataset have the statistical properties we would 
expect (Section [5j). We conclude by suggesting other 
potential applications and improvements that could be 
made with this method (Sections [6] and Yfh. 

The reader should note that throughout this paper we 
use the terms "eclipse" and "transit" completely inter- 
changeably, referring to a planet passing in front of its 
star as seen from Earth. 

2. OBSERVATIONS 

To frame the observational problem we hope to ad- 
dress, we summarize the properties of the photometric 
data gathered by the MEarth Project since beginning 
its full operation in 2008. For completeness, we reiter- 
ate some of the points described in the M Earth design 
strategy (Nutzman & Charbonneau 2008[ ), emphasizing 
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the qualitative features of the MEarth data that present 
particular challenges to our goal of detecting transits of 
habitable super-Earths. 

2.1. The Observatory 

Each of MEarth's eight telescopes is an f/9 40- 
cm Ritchey-Chretien mounted on a German Equatorial 
mount. The telescopes are located in a single enclosure 
with a roll-off roof at the Fred Lawrence Whipple Ob- 
servatory (FLWO) at Mount Hopkins, Arizona. They 
are robotically controlled and observe every clear night, 
except for instrument failures. Due to the summer mon- 
soon in Arizona, we never observe during the month of 
August when FLWO is closed, and we rarely gather much 
useful data in July or September. 

Each telescope is equipped with a 2048 x 2048 CCD 
with a pixel scale of 0.76" /pixel, for a 26' field of view. 
Our target list contains 2,00 nearby M dwarfs (selected 
from Lepine & Shara 2005 1 that are spread all across 
the Northern sky (d > CPJ, so they must be observed 
one-by-one, in a pointed fashion. The held of view is 
large enough to contain ample comparison stars for each 
MEarth target, with typically at least ten times as many 
photons available from comparisons as from the target. 

We use a custom 715 nm longpass filter, relying on the 
quantum efficiency of our back-illuminated e2v CCD42- 
40 detector to define the long-wavelength response of the 
system. Extending out to 1000 nm, the shape of this 
resp onse resembles a com bination of the Sloan i + z fil- 
ters ( |Fukugita et al.|[l996 ). The broad wavelength range 
of this filter was designed to maximize our photon flux 
from M dwarfs, but it introduces an important system- 
atic effect into our photometry, as outlined in Section 
12X51 

2.2. Weather Monitoring 

MEarth continuously monitors the conditions on Mt. 
Hopkins with a suite of weather sensors. At ground level, 
we measure temperature, humidity, and wind speed, as 
well as rain and hail accumulation. We detect cloud cover 
with a wide-angle infrared sensor (a TPS-534 thermopile) 
that measures the sky brightness tempe rature at wave- 
lengths > 5.5/im (see Clay et al. 19981. The primary 
purpose of this monitoring is to prevent damage to the 
telescopes by keeping the observatory closed during in- 
clement weather, but the timeseries from this monitoring 
are also useful in later analysis for identifying weather- 
related systematics in our data. 



2.3. Calibrations 

To go from raw images to reduced light cur ves, we fol- 
low th e procedure and use modified code from |Irwin et al.| 
( 2007 ). Here, we review those points in the process where 
calibration error could potentially lead to light curve sys- 
tematics. 

2.3.1. Non-linearity 

The MEarth CCD's behave slightly non-linearly at all 
count levels, increasing up from a 1 — 2% non-linearity at 
half of the detector full well up to 3 — 4% near the onset 
of saturation. Because we often need to use comparison 
stars with different magnitudes than our target star, we 



must account for this non-linearity. When setting expo- 
sure times, we avoid surpassing 50% of the detector's full 
well, and estimate a correction for the non-linearity using 
sets of daytime dome flats taken with different exposure 
times. With these measures in place, we see no evidence 
that non-linearity limits our photometric performance. 

2.3.2. Dark Current and Persistence 

We scale dark exposures taken at the end of each night 
to remove some of the CCD dark current. However, un- 
til 2011 we operated our Peltier-cooled detectors at -20° 
to -15° C, and at these warm temperatures they showed 
significant persistence. That is, images of bright stars 
would persist as excess localized dark current in subse- 
quent images, slowly decaying with a half hour timescale 
below an initial 1% fraction of the original fluence. This 
was a significant source of systematics: stars in incom- 
ing exposures could land on the same pixels as persistent 
ghost stars from previous exposures, thus gaining a hid- 
den amount of flux that depended on how recently and 
strongly those pixels were illuminated. As this effect de- 
pends on the entire recent 2-dimcnsional illumination his- 
tory of the detector, correcting for it would be extremely 
complicated. We partially mitigated the persistence by 
ensuring different target stars were observed on different 
regions of the detector, but could not eliminate problems 
due to overlap with comparison stars or due to changes 
in cadence. Updating our camera housings in 2011, we 
now operate at -30° C where the amplitude of the per- 
sistence is lower. We also adopt a detector preflash be- 
fore each exposure; this increases the overall dark current 
but suppresses localized persistent images. Between the 
lower temperature and this preflash step, persistence no 
longer has a substantial effect on MEarth photometry. 

2.3.3. Flat-field Sensitivity Map 

We gather flat-fields at evening and morning twilight, 
typically 8 per telescope per twilight, with empiric ally set 
exposu r e tim es estimated using the equations of Tyson 
|fc Gal| ( |1993[ ). To average out large-scale gradients in 
the illumination, we always take adjacent pairs of flats 
on opposite sides of the meridian, which has the effect 
of rotating the whole optical system relative to the sky 
(thanks to our German Equatorial mounts). Because 
our optical system shows high levels of centrally concen- 
trated scattered light (10 — 15% of sky before 2011 and 
< 5% after; see Table [lj that corrupts the large-scale 
structure in twilight flat exposures, we estimate the sen- 
sitivity in the detector plane in two steps. First, we esti- 
mate a small-scale sensitivity map that accounts for dust 
donuts and pixel-to-pixel variations in the detector sensi- 
tivity by filtering out large-scale structure from the com- 
bined twilight flats. Second, we derive a large-scale map 
from dithered photometry of dense star fields to account 
for the non-uniform illumination across the field of view. 
Additionally, our camera's leaf shutter takes a finite time 
to open and close, resulting in a varying exposure time 
across the field of view (on a 1 second exposure, the am- 
plitude of this effect is 5%); we apply a shutter correction 
estimated from sets of twilight flats. Altogether, our flat- 
fielding procedure achieves a precision of 1% across the 
entire detector. 

However, because we hope to perform photometry 
down to the level of 0.1%, this 1% knowledge of the 
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Figure 1. One week of light curves of all M dwarf targets observed on all MEarth telescopes (red lines), along with a 30-minute median- 
binned estimate of their shared behavior (black line, with uncertainty estimates). This "common mode" shows significant variations both 
within and between nights. We attribute this phenomenon to variations in the precipitable water vapor above our telescopes changing 
the effective shape of our wide bandpass, effectively causing more extinction for red M dwarfs than for their bluer comparison stars. The 
common mode correlates strongly with measured humidity and sky temperature (see Figure [2j. 
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Figure 2. The relationship between the shared "common mode" 
behavior in MEarth photometry and ground-level relative humid- 
ity (left) and the difference between sky and ambient temperatures 
(right), both very rough (and not necessarily linear) tracers of the 
total precipitable water vapor in the overlying column. For the 
entire 2011-2012 season, we show each quantity averaged over in- 
dependent half hour intervals (see Figure [TJ as gray points. Error 
bars indicate the mean and its standard error for the common mode 
in subdivisions of humidity or sky temperature. In our bandpass, 
M dwarfs appear fainter when levels of precipitable water vapor 
are higher. 



sensitivity across the field is still imperfect and will in- 
evitably be a source of systematics in our light curves. 
One unavoidable problem is that our German Equatorial 
mounts require the detector to flip 180° when crossing 
the meridian. In light curves, this causes offsets as large 
as 1% between opposites sides of the meridian, as stars 
sample different regions of the large-scale sensitivity of 
the camera. Notably, the step-function morphology of 
this systematic can mimic a transit ingress or egress. In 
addition to this "meridian flip" problem, we achieve a 
blind RMS pointing accuracy of 60-120". To improve 
on this, at each pointing we take a short binned image 
and use its astrometric solution to nudge the telescope 
to the correct pointing before science exposures, with a 
random error typically of 1-2" . This minimizes the im- 
pact of these pointing errors, but does not completely 
remove the problem of stars sampling different pixels on 
an imperfectly flat-fielded detector. 

2.3.4. Differential Photometry 



We perform aperture photometry on all sources in the 
field of view. For each exposure, we derive a differential 
photometric correction from point sources in the field us- 
ing an iterative, weighted, clipped fit that e xcludes vari- 
able stars from the comparison sample (see Irwin et al. 
2007 for details). We calculate a theoretical uncertainty 
estimate Othc(£), m magnitude^ for each point: 



0~the(t) 



2.5 
In 10 



As 



'sky 



A9 



where N 7 is the number of photons from the source, <7g ky 
is an empirically determined sky noise estimate for the 
photometric aperture that includes read a nd dark noise , 
C7g cint is the anticipated scintillation noise flYoung|l967l ), 
and Ccomp accounts for the uncertainty in the compari- 
son star solution. In some MEarth fields with very few 
comparisons, the cr^ omp term can be a significant contri- 
bution to the overall uncertainty. 

2.3.5. Precipitable Water Vapor 

A crucial assumption of this differential photometry 
procedure is that atmospheric or instrumental flux losses 
are exactly mirrored between target and comparison 
stars. Our wide 715-1000 run bandpass overlaps strong 
telluric absorption features due to water vapor, so as the 
level of precipitable water vapor (PWV) changes in the 
column over our telescopes, their effective wavelength re- 
sponse will also change. Red stars will experience a larger 
share of this time- variable PWV-induced extinction than 
stars that are blue in this wavelength range. As a typical 
MEarth field consists of one very red target star (median 
target r — J = 3.8 J] amongst much bluer comparison 
stars (median comparison i — J = 1.3), most MEarth M 
dwarfs exhibit systematic trends caused by this second- 
order extinction effect. This PWV problem has been 
noted be fore as a limitation for cool objects observed in 
the NIR (B ailer- Jones k L am m | 2"003} |Blake et al |2008|). 



201 1[ ) showed that OPS water 
e used to correct for the 



Recently, Blake &: Shaw 
vapor monitoring could 
fluence of PWV variations, improving both relative and 

4 Technically, we co nvert from re l ative flux uncertainties into 
magnitude space as in |Naylor et al] | |2002[ l, to which Eq. [T] is an 
accurate Taylor approximation. 

5 We take r mag nitudes from the Carlsberg Meridi an survey 
jEvans et al.||2002| , and J magnitudes from 2MASS flSkrutskie] 
|et al.|^UUb| ). 
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absolute photometric accuracy of SDSS red star photom- 
etry. 

While we do not have a GPS water vapor monitor, 
we can track the impact of PWV variations on MEarth 
photometry using the ensemble of observations we gather 
each night, observations of red stars in fields of blue com- 
parisons. Figure [T] shows all of the M dwarf light curves 
gathered by MEarth over one week, after applying basic 
differential photometry. These light curves (of different 
M dwarfs observed on different telescopes) move up and 
down in unison, reflecting water vapor changes in the at- 
mosphere they all share. These trends correlate strongly 
with ground-level humidity and ambient sky tempera- 
ture, which are rough tracers of PWV in the overlying 
column. As PWV variations within a night can mimic 
transit signals (e.g. the first panel of Figure fl]), we must 
account for this effect when searching for planets. For- 
tunately, because these trends are shared among all our 
targets, we can estimate a "common mode" timeseries 
from the data themselves and use it to correct for these 
trends (see Section [4]). 

2.4. Science Observations 

The observations of our target M dwarfs are sched- 
uled automatically using an ad hoc dynamic scheduling 
algorithm. This algorithm weights the observability of 
targets with the usefulness of the data to the survey as 
a whole, prioritizing gap-free cadences while minimizing 
slewing overheads. Each star is tied to a particular tele- 
scope, for ease of calibration and light curve production. 

To inform this scheduling, we estimate masses, radii, 
and eff ective temperatures for all stars in the MEarth 
sample ( Nutzman & Charbonneau 2008 ). Based on these 
estimates, we set the observational cadence to be suffi- 
cient to obtain two in-transit points from a mid-latitude 
transit of habitable zone planet. Because M dwarfs are 
dense stars, their transit durations are short (typically 
about 1 hour), requiring us to observe each star once 
every 20 minutes. 

Based on our estimated stellar radii, we set our expo- 
sure time for each star so that we will record as many 
photons as are necessary to for the transit of a 2R e 
planet to have a 3a transit depth. In cases where the 
required exposure time exceeds 2 minutes or would cause 
the peak counts in the star to exceed half of the detector 
full-well capacity, we split the observation into multiple 
sub-exposures. Stars requiring more than 7 minutes per 
pointing are never observed. If the time to reach 2R ffi 
is less than 60 seconds (i.e. bright, late M dwarfs), we 
artificially increase the exposure time. For the analy- 
ses presented in this paper, we combine all observations 
taken in a single pointing using scaled inverse-variance 
weighted means. 

The scheduler input list can be updated in real-time, 
allowing us to "trigger" high-cadence observations of the 
egress of interesting transit events that are detected in 
progress. By immediately gathering more observations 
in candidate transits, we can greatly magnify the signifi- 
cance of an initial 3er detection or refute it entirely with- 
out having to wait for future transits. As currently im- 
plemented, the real-time trigger assesses the significance 
of ongoing transits after subtracting a fixed systematics 
model and harmonic variability model and inflating the 
theoretical error on the in-transit mean with an uncer- 



tainty estimate on the baseline out-of-transit level that is 
exponentially weighted toward recent observations. This 
practical estimator may eventually be replaced by the 
method explored in this paper. Skimming each star with 
a minimal cadence and triggering on marginal candidates 
maximizes our overall efficiency and increases our sensi- 
tivity to long-period planets. 

2.5. Morphological Description of the Light Curves 

A typical MEarth light curve for a target M dwarf con- 
tains roughly 1000 observations spanning one observa- 
tional season. Most of the time, the 20 minute cadence 
is continuous within each night for the time a star has 
a zenith distance < 60°, but could be faster than this 
for up to several hours if a trigger occurred on the star. 
The cadence might also contain gaps within a night due 
to passing clouds or if a trigger occurred on another star 
observed by the same telescope. On longer timescales, in 
addition to gaps for daylight, light curves contain days- 
to months-long gaps from weather losses, instrumental 
failures, and scheduling conflicts (proximity to the Moon, 
other targets with higher priorities). 

One useful summary of the challenge MEarth light 
curves present is our achieved RMS scatter. For all M 
dwarfs observed in the past four years, we show in Fig- 
ure|3]both the RMS predicted with our CCD noise model 
(Eq. [T]) and the RMS actually achieved after basic dif- 
ferential photometry has been performed. To emphasize 
the implications for planet detection, we cast the RMS 
in terms of the size of planet that could be identified 
at 3cr confidence in a single observation, given our stel- 
lar radius estimates^] This choice of parameter space 
(over the more common RMS vs. apparent magnitude 
space) reflects two of MEarth's unique aspects. First, 
we know more about our target stars than most wide- 
field surveys, enabling this translation from RMS (and a 
corresponding detectable transit depth) into detectable 
planetary radius. Second, because we set exposure times 
individually for each star, our sensitivity to transits does 
not depend on stellar apparent magnitude. The panels in 
Figurep3]show variations from year to year (see also Table 
[I]), but all seasons of MEarth observations show signifi- 
cant gaps between the predicted and achieved noise. This 
indicates that stellar variability and systematics domi- 
nate over photon noise, highlighting the need for robust 
method to correct for these complicated noise sources in 
our search for transits. 

The source of excess scatter is sometimes known and 
sometimes unknown. By eye, some of the excess noise 
is clearly astrophysical, e.g. sinusoidal modulations from 
starspots rotating in and out of view or flares abruptly 
appearing then slowly decaying. Some is clearly instru- 
mental, in that it can be associated with externally mea- 
sured variables like position on the detector, weather pa- 
rameters, or the behavior of other stars. Photometric 
outliers can often be associated with wind shake, where 
images exhibit broad and misshapen point spread func- 
tions. And lastly, some of the noise appears simply as 
unstructured excess scatter; this is caused either by as- 
trophysical variability on timescales shorter than 20 min- 

6 Specifically, the vertical axis in Figure [3] is given by \/3a X i?*, 
where <r is the relative flux uncertainty in a single observation (ei- 
ther predicted or achieved) and is the estimated stellar radius. 
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Figure 3. The per-point RMS photometric uncertainty as predicted from a CCD noise model (blue open circles) and the RMS actually 
achieved in the raw differential photometry (orange filled circles) as a function of estimated stellar radius for all MEarth targets. We cast 
the photometric uncertainty for each M dwarf target in units of the planet radius corresponding to a 3tr transit depth, given the inferred 
stellar radius; contours of constant RMS are shown for reference (dashed lines, equally spaced from 0.2 to 1%). Unlike wide-field surveys, 
we set exposure times individually for each target, minimizing the importa nce of apparent magnitude in these plots. The achieved RMS is 
shown before any treatment of systematics or stellar variability; see Figure [111 for comparison. One season of MEarth photometry is shown 
in each panel; Table [T] explains the causes of many of the year-to-year variations. 



Table 1 

Evolution of MEarth Hardware/Software 



Season Notes 



2008- 2009 Telescopes were operated purposely out-of- 

focus to minimize readout overheads, and ex- 
posure times were generally set to the max- 
imum for a single (defocused) image, about 
250,000 photons. The real-time trigger did not 
operate on-sky. 

2009- 2010 After repeated focus mechanism failures re- 

sulted in many light curves experiencing large 
focus drifts, telescopes have been operated in 
or near focus since early in this season with 
the use of sub-exposures to avoid overexposure. 
Smaller stars were prioritized in the scheduling 
queue, as noticeable in Figure[3] The real-time 
trigger began operating in November but was 
not always active due to development efforts. 

2010- 2011 In an attempt to remove systematics due to 

PWV, we operated during this year with a nar- 
rower filter (715 — 895nm, roughly Iq in shape) 
designed to avoid strong telluric water features. 
Unfortunately, the interference cutoff of this fil- 
ter was found to be sensitive to humidity and 
temperature, resulting in larger common mode 
variations and higher systematic noise in the 
light curves (see Figures [3] and [6]l . Scattered 
light was also more pronounced with these fil- 
ters, spurring our multipart flat-fielding pro- 
cedure. The real-time trigger improved in its 
response time and its treatment of variability 
and the common mode. 

2011- 2012 We returned to using the original MEarth 715 

nm long-pass filter, but maintained the soft- 
ware improvements developed from the previ- 
ous year. Dark flocking material affixed to the 
telescope baffles suppressed some of the scat- 
tered light. The real-time trigger operated nor- 
mally for most of the season. 

utes or by unidentified systematics. 

3. BACKGROUND 

The problem of finding and assessing the significance 
of transiting exoplanet candidates in stellar photometry 
is an old one, and one that has already met many suc- 
cessful solutions. At their core, the majority of these 



solutions are variants of the matched filt er Transit De- 
tection Algorithm originally proposed by |Jenkins et al.| 
(1996), in which detection statistics are generated by 
matching light curves to families of templates consisting 
of periodic trains of transit-shaped pulses. The simplest 
and most intuitive o f these methods is the Box-fitting 
Least Squares (BLS; |Kovacs et"aL 20021, which models 



transits as simple boxcars in otherwise hat light curves. 
BLS identifies interesting candidates by folding individ- 
ual photometric observations to trial periods, searching 
a grid of transit epochs and durations at each period, 
and picking the parameters that maximize the transit 
depth sig nificance in a l east-s quares or \ 2 sense. As dis 



cussed by Aig rain et al. 102004 ), many other matched filter 
methods \uayle et al.|2OTTf|Def;iy et al.|2001l|Aigrain fc| 



Favata||2002l Street et al. 112003 (Jarpano et a 



2003 are 



essentially generalizations of BLS 

By assuming a flat out-of-transit light curve, BLS by 
itself can have a tendency to fold up any (non-planetary) 
time-correlated structures into seemingly significant can- 
didates, when applied to real, wiggly light curves. As 
such, BLS is often paired with some sort of pre-search 
cleaning step to remove trends that could be caused ei- 
ther by instrumental effects or intrinsic stellar variability. 

To deal with systematics, algorithms such as t he Trend 



Filtering Algorithm (TFA; |Kovacs et al |2005[) and the 



principal compone nt analys is-like Systematics Removal 
method (SysRem; Tamuz et al. 20051 were developed 
to remove trends that are present m multiple stars in a 
field and thus presumably not astrophysical. These algo- 
rithms use linear combinations of comparison star light 
curves to minimize the scatter in target stars. While 
these methods can remove trends without explicit knowl- 
edge of their causes, the trends do sometimes cluster into 
fami lies that can be identified with physical processes 
(e.g. Kim et al.|2009 ). Unfortunately, strategies like TFA 
that work by constructing templates out of large num- 
bers of field stars are of limited use for MEarth, with its 
small field of view and the substantial spectral type dif- 
ference between our targets and comparisons. Methods 
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that include known physical effects t hrough linear mod- 



els of externally measured variables (Bakos et al. 2010 
Ofir et al.|[2010| are more helpful for JVLEarth-like data. 

Of course, systematics can also generally be minimized 
by improving various elements of the photometric reduc- 
tion, observational strategy, or instrumentation. When 
it can be done at reasonable cost, this is always prefer- 
able to applying filtering methods after the fact, because 
filtering inevitably suppresses the desired signal in addi- 
tion to the noise. 

To clean stellar variability from light curves, many 
methods were developed in preparation for space tran- 
sit surveys like CoRoT and Kepler, where p recision pho- 
tome t ry makes it a dominant concern (e. g. Defay et al. 
2001] |Jenkins|[2002l [Carpano et al.l[2003] ]Aigram fc Ir- 
win||2004| |Regulo et al.| 2007[ |Bonomo fc Lanza[[2 008 ) . 
These methods operate in the time, wavelet, or Fourier 
domains; many of them assume uniform photometric un- 
certainties and uniform cadence, as can realistically only 
be achieved from spac e. Running median filters (e.g. 
Aigrain fe Irw in 2004) or piece- wise polynomial/spline 
fits (e.g. Croll et al.||2007[) have also proven effective for 
removing smooth variability from high S/N light curves. 
Ground-based surveys for planets in open clusters moti- 
vated new methods to remove large a mplitude variability 
from light curves with diurnal gaps ([Street et al. 2003 



BramidTeFan2005; Bur ke et al. (2006) | Aigrain et al.|2007 
Millcr ~ al.||2008[ ), often by fitting series of sinusoids or 
allowing slowly varying baselines. We refer the r eader 
to reviews and comparisons of the se me thods by |Tin- 
gley l (1 20031), I Aigrain fe Irwinl (120041), and IMoutou et al. 
(2005f~ 

JNASA's spaced-based Kepler Mission pub lished the 
first Earth-sized planets ( |Fressin et al. | [2012|) and over 
2,300 transiting planet candidates ( |Batalha et al. 20121 
at the time of this writing. This success is thanks both 
to the design and stability of the spacecraft and to the 
sophistication with which the Kepler team accounts for 
its noise sources. To identify candidate transiting planets 
and assess their significance, Kepler employs a wavelet- 
based matched filter that is both opt imal and efficient 
( |Jenkins|200"2{|Tenenbaum et al.|20"l2| ). The Kepler Pre- 
search Data Conditioning pipeline can also disentangle 
instrumental systematics from stellar variability using a 
linear model like the ones above paired with a Maxi- 
mum A Posteriori approach employing empirical priors 
on the decor relation coefficients to prevent over- fitting 
(PDC-MAP; |Smith et al.|2012||St^nipe et al.|2012| . Un- 
fortunately, due to the need tor uniformly spaced data 
to run the wavelet filter, the applicability of the Kepler 
transit-search method is limited in ground-based obser- 
vations. 

The coupling of many of the above cleaning methods 
with the BLS search has proven extremely successful for 
wide-fi eld surveys. Using these meth ods, surveys such a s 
TrES (|Alonso et aTl2004}, HA TNet (|Bakgs_et_aL| 2C^, 
XOj |McCullough e t al. 20 05|. WA SP (Pollacco eTaL l 
2006| and KELT ( |Siverdet al.||2012] |Beatty et al.||2012| ) 



have made the ground- based detection of hot gas giants 
transiting Sun-like stars routine. Amidst these successes, 
why should we bother to develop new methods? 

Most of the above cleaning methods that are suitable 
for use from the ground work by subtracting some opti- 
mized model for systematics and variability from a target 



light curve. Subtracting this model inevitably introduces 
some extra uncertainty to the light curve: a cleaned light 
curve cannot possibly be as reliable as a light curve that 
did not need to be cleaned in the first place. However, 
these methods generally do not include a route for prop- 
agating the uncertainty from this cleaning into the signif- 
icance of candidate transits. When many transits will be 
folded into into a planet candidate, this is okay. It is suf- 
ficient to know the average effect the cleaning has on the 
light curve, for example, that global filtering with TEA 
suppresses transit depths by 20% on average in HATNet 
(Bako s et al.|2012 ). In contrast, when only a single tran- 
sit is available, knowing the average effect is not enough. 
We need to know: to what extent can we say that any 
one given dip is a bona fide eclipse and not the result of 
over- or under-correction by the cleaning process? 

We need a method that escapes the clean-first, search- 
later dichotomy of many of the previous methods. If the 
cleaning and the search are a two step process, the search 
knows nothing about how the cleaning has suppressed or 
exaggerated the apparent significance of transit-like fea- 
tures, making establishing rigorous detection thresholds 
very difficult. We need a reliable way to include our un- 
certainty in the corrections we make for systematics and 
variability in our search for planets; one way to do this 
is to combine the steps together, allowing the search to 
know about all the complicating details that go into the 
cleaning. 

Here, we present a new method to detect single transits 
and robustly assess their significance. With 10-20 day M 
dwarf habitable zone planets offering at most a handful 
of observable transits per season, the ability to identify 
promising candidates with one or very few transits is 
absolutely necessary to our success. Here, we present a 
method for folding single transits into phased planet can- 
didates, but we do not focus extensively on the problem 
finding the true periods of candidate systems. Although 
the challenge MEarth faces is not a s bad as for the most 
sparsely sampled light curves (s ee Dupuy fe Liu 2009 
Tingley 2011 Dzigan & Zucker 2012[ ), it will generally 
be extremely difficult to find accurate periods for 10-20 
day planets from MEarth survey data alone. Rather, our 
goal is to be able to assess which candidates have high 
enough significance that they warrant the allocation of 
follow-up resources to, eventually, establish their periods 
and confirm their planetary nature. 

4. INVESTIGATING A SINGLE ECLIPSE: MISS MARPLE 

We start by assessing the significance of an individ- 
ual transit event within the context of a single night of 
observations of a star. We do so in the context of a pa- 
rameterized, generative model for each target star light 
curve. This model contains parameters describing a sim- 
ple box-shaped eclipse model, as well as parameters de- 
scribing systematic effects plaguing the light curve and 
the star's intrinsic stellar variability. 

Of the many parameters in this model, the depth D of a 
putative planetary eclipse is particularly important. We 
are interested in answering the following question: given 
a hypothetical lone planetary transit, with an epoch pe 
and duration px, what is the probability distribution 
of the planetary eclipse depth D that the data imply? 
The integral of the normalized probability distribution 
P{D\pe,Pt) over the range D > would provide a mea- 
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sure of the detection significance of the single eclipse. 
While P(D) could generally take on any shape, we will 
approximate its shape to be Gaussian, so that we can 
completely characterize the distribution with two num- 
bers, the maximum probability depth D and a width a. 
In usual astronomical parlance, if D/a > n then we have 
detected the eclipse "at na." 

We want P(D) to be conditional only on the parame- 
ters pe and pr] it should be marginalized over all other 
parameters to account for the additional uncertainty that 
each of these add to the width of the distribution. That 
is, we want P(D) to be the Marginalized Probability of a 
Lone Eclipse (MarPLE), whose Gaussian width we will 
refer to as CMarPLE- In particular, because the transit 
depth could conceivably be quite correlated with the stel- 
lar variability or systematics parameters, marginalizing 
over these parameters will be crucial for a robust measure 
the eclipse depth uncertainty and thus the significance of 
the detection. To achieve this goal, we outline a Method 
to Include Starspots and Systematics in the Marginalized 
Probability of a Lone Eclipse (MISS MarPLE) below. 

4.1. The Model 

At the core of MISS MarPLE is a model that attempts 
to describe every aspect of a single night of MEarth pho- 
tometry of a single M dwarf. We use d(t) to refer to 
the "data" sampled at time t: the relative flux measure- 
ments of the target star after basic differential photomet- 
ric corrections have been applied. The two main aspects 
of the model are an idealized, noiseless light curve m{t) 
and the uncertainty associated with a data point at any 
given time er(i). This model is generative, in the sense 
that fake light curves created with this model aim to be 
statistically equivalent to real MEarth light curves. Even 
if the model is an incomplete description of d(t), it will 
still be useful for estimating the significance of a given 
candidate by allowing us to fit for and marginalize over 
the model parameters. 

Throughout the following sections, light curves such as 
m(t) and d(t) will be expressed in magnitudes, so that 
effects that are multiplicative in flux can be described as 
linear models. 

We write the model for the idealized, noiseless light 
curve as 

m(i) = S(t) + V(t) + P(t) (2) 

where S(t) models trends caused by instrumental sys- 
tematics, V(t) models the variability of the star in the 
absence of planetary transits, and P(t) models the signal 
from a hypothetical transiting planet. 

4.1.1. Systematics Model 

The S(t) term in Eq. |2| enables us to include system- 
atic trends that show clear correlations with externally 
measured variables. We construct S(t) as a linear com- 
bination of N sys relevant external templates: 

N Bya 

S(t) = J2s j E j (t). (3) 

Here Ej(t) represent timeseries of the external variables, 
sampled at the times as the photometric observations, 
and the Sj are systematics coefficients. For MEarth, at 



a bare minimum, we include N sys = 6 terms in this sum: 
the "common mode," the "meridian flip" , and the x and 
y pixel positions on either side of the meridian. 

E>CM{t) : The common mode template is constructed 
from the ensemble of raw M dwarf light curves from 
all telescopes and accounts for photometric trends 
that are shared in all MEarth M dwarf photome- 
try (due to PWV variations, see Figures [I] and 111). 
The effect is stronger for redder stars; forJVlEarth 
targets, the best fit values of the coefficient s CM 
correlates with stellar r — J color. 

E meT id(t)'- To account for stars sampling different regions 
of the detector when observing at positive or neg- 
ative hour angles with MEarth's German Equato- 
rial mounts, we include a "meridian flip" template. 
This template is simply defined as for observa- 
tions taken in one orientation and 1 for observa- 
tions in the other, thus allowing light curves on 
two sides of the meridian flip to have different base- 
lines m 

E x ,i{t) and E y ^(t) for i=0,l: The pixel position tem- 
plates are simply the x and y centroids of the tar- 
get star on the detector, with their medians sub- 
tracted. Two sets are required, one for each side 
of the meridian. Correlations with these templates 
could arise as pointing errors allow a star to drift 
over uncorrected small-scale features in the sensi- 
tivity of the detector (e.g., transient dust donuts). 

Additional external variables may also be used as sys- 
tematics templates, such as FWHM or airmass. With 
MEarth, we find these variables are correlated with the 
photometry for only a few fields, and are usually ex- 
cluded. 

4.1.2. Variability Model 

The V(t) term in Eq. [2] describes the variability of the 
star throughout one night, independent of the presence of 
a transiting planet. Such variability includes fluctuations 
due to rotating spots (smoothly varying on the 0.1 to 100 
day timescale of the star's rotation period) and flares 
(impulsively appearing, with a decay timescale typically 
of hours). The morphology of this variability can be quite 
complicated; we use a simplified model to capture its key 
features, writing 

V(t) = Wnight + 





/2ttA 


m 


+ V cos COS 




\v P J 



^flares 

E fM ( 4 ) 

The first term f n i g ht allows each night to have its own 
baseline flux level. By itself, this term can capture 
most of the variability from stars with long rotation peri- 
ods, where the flux modulation from starspots smoothly 
varies over timescales much longer than one night. By 

7 In practice, we also allow additional offsets corresponding to 
each time a camera is taken off of its telescope. This is implemented 
as a simple extension of the -E mer idM term described here. 
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fitting for a different u n i g ht f° r each night we can piece 
together the variability of the star on timescales > 1 
day as a series of scaled step functions. The harmonic 
i> s in and v cos terms capture variability with period of up 
and become especially important for stars with shorter 
rotation periods. Because we fit a separate w n ; g ht for 
each night, there can be substantial degeneracy between 
the harmonic terms and the nightly offsets, especially 
for slowly rotating stars. We d iscu ss this issue, as well as 
how we estimate vp in Section [4. 2| Although we only in- 
clude one harmonic of the fundamental period vp in these 
sinusoidal terms, additional harmonics could be included 
if the data warranted them. 

The final term in Eq. [4] includes contributions from 
-^flares hypothetical stellar flares fj(t) that may or may 
not be present within the night. Flares are suppressed 
in MEarth's re latively red bandpass, but not completely 
eliminated (see |Tofnemire et aL 2012[ ). The main purpose 
of the flare term is to identify those nights of photome- 
try that may be corrupted due to the presence of flares. 
While wc could in principle model a night that contained 
both flares and a transit, wc find this to be very difficult 
in practice, due to the morphological complexity flares 



sometimes exhibit (e.g. Kowalski et al. 2010 Schmidt 



et al.|[2012 ). Rather, on each night we model simple hy- 



pothetical flares as fast-rising and exponentially decay- 
ing, and perform a grid search over the start time and 
decay timescale. If any flares have amplitudes that are 
detected at > 4er, we excise that night of data from our 
planet search. These cuts dramatically reduces planetary 
false positives due to flaring activity (e.g. confusing the 
start of the flare with the egress of a transit), with the 
meager cost of ignoring 3.6% of MEarth's observations. 
Because both planetary transits and flares are rare in 
MEarth data, and their overlap even moreso, the losses 
from this strategy are small. 

4.1.3. Planetary Eclipse Model 

The last term in Eq. [2] P(t), includes the signal of 
a hypothetical transiting planet. We model transits as 
having infinitely short ingress/egress times and ignore 
the effects of limb-darkening on the host star, so transits 
appear as simple boxcars. In this section, we are inter- 
ested only in assessing the significance of a single transit 
event falling within a single night, not a periodic train 
of transits. With these simplifications, a lone planetary 
eclipse signal is completely described by a transit epoch 
Pe, a transit duration px, and a transit depth D. The 
signal is then simply 



P(t) 



D ii\t-p E \<p T /2 
otherwise 



(5) 



This model includes only one eclipse event per night. We 
discuss combining the se lo ne eclipses into periodic transit 
candidates in Section l4~3l 



4.1.4. Photometric Uncertainty Model 

A crucial component of the model is o~(t), the photo- 
metric uncertainty of each observation. Our theoretical 
uncertainty estimate for a given datapoint cr t he{t) is a 
lower limit on the true uncertainty. To express this fact, 
we introduce a noise rescaling parameter r a . w such that 



where r a . w > 1. The subscript w emphasizes that this 
is a white noise rescaling parameter that does not ac- 
count for correlations between nearby data points. If 
left unmodelled, such red noise c ould substantially bias 
a transit's detection significance ( Pont et al. 2006); we 
discuss a correction for red noise in j j4.2.6| 

4.2. The Posterior Probability 

For a reasonable choice of parameters, the model in 
Eq. [2] could generate a fake light curve that would have 
most of the features of single night of a real MEarth 
light curve. But how do we pick a reasonable choice of 
parameters? In this section, we write down their prob- 
ability distribution and show how to solve for its peak, 
which turns out to be a linear minimization process with 
slight iterative refinement. 

Considering a single night of observations, we write the 
shape of the probability distribution of these parameters 
as 

P(M|B) oc P(B|M)P(M), (7) 

where P(M|D) is the posterior probability of the model 
M given the data D, P(D|M) is the likelihood of the data 
given the model, and P(M) is the prior probability of 
the model. These functions describe probability density 
distributions that live in an n-dimensional hyperspace 
with as many dimensions as there are parameters in the 
model. 

4.2.1. The Likelihood = P(B|M) 

We describe each of the iV b s photometric observations 
d(U) within a particular night as being drawn from a 
Gaussian distribution centered on m(U) and with a vari- 
ance of a(ti) 2 . Assuming the observations to be indepen- 
dent, the likelihood can be written as 



No 



n 

i=l 



1 



exp 



1 ^ d(U 



m(ti) 



o-(U 



Taking the logarithm, substituting Eq. [6j and defining 



X 



E 



d(U) - m(U) 

Othe(*i) 



(8) 



we find that the (log) likelihood simplifies to 
= -N ohs lnr„ w 



InP 



X 



2rl 



constant (9) 



where we have only explicitly included terms that depend 
on the parameters of the model. For fixed r a ,w, maxi- 
mizing Eq. [9] is equivalent to minimizing the commonly 
used x 2 figure of merit. 



a(t) 



'jCthc(i)- 



(6) 



4.2.2. The Prior 

For any one star, a particular night of MEarth pho- 
tometry may contain roughly as many light curve points 
as there are parameters in our model. As such, the like- 
lihood P(D|M) from one night of data only very weakly 
constrains the parameter space. But of course, each night 
of MEarth observations is just one of many nights span- 
ning an entire season, and we should use this season-long 
information when investigating a single night. To imple- 
ment this holistic awareness of the context provided by 
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a large pool of observations, we generate probability dis- 
tributions for various parameters by looking at the whole 
season of data. We then apply them as priors P(M) on 
the parameters for an individual night. 

By construction, the most important parameters of our 
model arc linear parameters. The conditional likelihood 
of linear parameters (a slice through P(B|M) with other 
parameters fixed) has a Gaussian form. For marginal- 
ization, it proves quite useful for the priors to be con- 
jugate to this shape - that is, also take on a Gaussian 
form. Referring to these linear parameters with the vec- 
tor C ^night ) ^sin ; ^cos : ^CM > ^merid j S x ,i , Sy , i > ^othcr? } 3 

we parameterize the prior P(M) as being proportional 
to a Gaussian distribution in Cj that is centered on an 
expectation value cj and with a variance of 71-2.. Multi- 
plying the independent priors for the A coc f coefficients 
and defining 



<f> 



JVcoef 

E 



(10) 



(11) 



leads to a term in the prior that looks like 

lnP(M) = -i$ 2 + ... 

The similarity in form of <i> 2 to \ 2 is t ne reason that the 
use of conjugate Gaussian priors is often described along 
the lines of "adding a prior as an extra data point in 
the x 2 sum," because the effect is identical in the overall 
posterior. In this framework, the smaller values of ir Cj 
provide tighter constraints on the parameter; we could 
express a fiat, non-informative prior for a particular Cj 
by choosing a large value of ir Cj . We set ttjj — oo, giving a 
flat prior on the transit depth. Note that we allow nega- 
tive transit depths (i.e. "anti-transits") to avoid skewing 
the null distribution of transit depths away from 0. 

For most of the remaining linear parameters, we take 
the values of cj and ir Cj directly from the results of a si- 
multaneous fit to the star's entire season of observations. 
In this prior-generating season-long fit, we fit the season- 
long light curve with a modified version of Eq. [2] that ex- 
cludes both the fj(t) term from flares and the P(t) term 
from hypothetical planets. To immunize against these 
unmodcllcd flares and eclipses, we perform the fit with 
4(7 clipping. We prefer to explain as much of the long- 
term variability as possible with the harmonic terms, so 
we fit first including only these terms in V(t). Then, 
fixing the values of i> s i n and v cos , we fit again with one 
fnight ,j free parameter for each night represented within 
the season. Thus, the values of u n i g ht,j then represent 
the deviation of the nightly flux level from a baseline 
sinusoidal model. 

Now, for the single night flux baseline parameter v n i g ht > 
we set 7r t , night equal to 1.48 x MAD (median absolute de- 
viation) of the ensemble of t> n ight,j values from the season 
fit. Stars that vary unpredictably from night to night will 
have a broad prior for i> n i g ht, thus requiring more data 
within a night to determine its baseline level. Conversely, 
stars that remain constant from night to night or have 
variability that is well described by a sinusoid will have 
a very tight prior. 

To understand the impact of n Vni ht , imagine the fol- 
lowing hypothetical scenario: a nignt in which MEarth 



gathered only one observation of a star, and that obser- 
vation happened to fall in the middle of a transit with a 
0.01 magnitude depth. With what significance could we 
detect this transit? If 7r lJnight = 0.01 magnitudes, then 
the detection significance would be at most la. But if 
knight = 0.001, then the transit could in principle be 
detected at high significance with only the single data 
point, provided the photon noise limit for the observa- 
tion was sufficiently precise. 

We note that s x ^ and s y ^, the coefficients for the x 
and y pixel position templates, would not be expected 
to be constant throughout a season. These terms are 
designed to account for flat-fielding errors, which could 
easily change from week to week or month to month. As 
such, we do not take Cj and Tij from the season-wide fit for 



these parameters. Rather, we fix Cj = and tt. 



0.001 



for all four of these parameters. This has the desired 
effect that an apparent 0.005 magnitude transit event 
that is associated with simultaneous 5 pixel shift away 
from the star's mean position on the detector would not 
be considered as a significant event. 

The most significant non-linear parameter is the white 
noise rescaling parameter r aw . In the season- long fit, 
Eq. [9] indicates that P(M|D) would have a shape of 



In P(r„ 



-iV sca In r a 



Xsca 

2r 2 



(12) 



where Xsea is the season-long \ 2 from the A sca obser- 
vation s in the e nsemble fit. This is maximized when 
r = \fxl c jN Bca . We want the nightly prior on r a , w to 
push it toward f, but we also want to provide enough flex- 
ibility that nights that are substantially better or worse 
than typical can be identified as such. To implement 
this, we mimic the shape of the season-long probability 
distribution but artificially broaden it with an effective 
weighting coefficient JV e ff . Propagating this loose prior 



lnP(M) 



Aoff In r a 



N cS f 2 



(13) 



into the posterior for an individual night, the Maximum 
A Posteriori (MAP) value of r a , w will be 



'XT 



A cff f 2 



eff 



(14) 



We artificially set N g — 4, so on nights with fewer than 
4 observations, the MAP value of r avj will be weighted 
most toward what the rest of the season says. On nights 
with more than 4 observations, the data from the night 

itself will more strongly drive the MAP value. 

We use a modified periodogram ( Irwin et al.|2011a ) as 
part of the season-wide fit to identity the best value of 
Up, the period of the harmonic terms in V(t). We fix 
vp to this value in all later analysis. While this effec- 
tively places an infinitely tight prior on this parameter, 
the degeneracy between it and the other variability pa- 
rameters, especially on the timescale of a single night of 
data, means that its uncertainty is usually accounted for 
by those terms. 

4.2.3. Maximizing and Marginalizing 
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Figure 4. A demonstration of MISS MarPLE applied to one night 
of MEarth observations, with the star's identifier, inferred stel- 
lar radius, and the date indicated above. We show the original 
MEarth photometry (top panel, black points), a variability and 
systematics model with no transits included (same panel, ±1<t; 
blue swath), and a visualization (next 9 panels) of the probability 
distribution of hypothetical transit depths P(D\pe,pt)- In this 
visualization, error bars represent the central ±1ct confidence re- 
gions of the marginalized, Gaussian-shaped P(D\pe,Pt) f° r the 
denoted values of the eclipse epoch p^ (along the time axis) and 
transit duration px (in separate panels). Some of the nuisance vari- 
ability and systematics parameters over which ffMarPLE has been 
marginalized are also shown (bottom panel), with the season- long 
priors in gray and the fits from this night in black. Note that the 
dip in photometry at the end of the night is not seen as a signif- 
icant, because it can be explained as a systematic, in this case a 
variation in the common mode (see the first panel of Figure^ from 
the same night). 



The shape of P(M) offers a big advantage to our goal 
of estimating the marginalized transit depth probability 
distributi on. Ac cou nting for all the terms in P( 
(Eq. [Tj |9j [TTJ and [13]), we find that the posterior P(R 
can indeed be maximized and marginalized analytically. 
For fixed pe, and px, the system of equations 



d d 
— lnP(M\B,p E ,p T ) = — 

OCj OCj 



$ z = (15) 



can be solved exactly for the MAP vector of values Cmap 
using only simple matrix operations. The procedure 
is directly analogous to the problem of weighte d linear 
least squares fitting; see Sivia & Skilling ( |2006 ch. 8) 
for details of this solution! While not strictly neces- 



sary because the priors prevent unconstrained degenera- 
cies in the solution, we use singular value decomposition 
(SVD ) to avoid c atastrophic errors in the matrix inver- 
sions ( [Pressl[2002| . 

The value ot r GW sets the relative weighting between 
the likelihood and the prior. Thus it is important to 
estimate r a , w accurately. We solve for it by iterating 



between Eq. [14] and Eq. [15j the solution typically con- 
verges to the MAP value within only a couple of itera- 
tions. We forego marginalizing over r ajW , instead fixing it 
to its MAP value. Solving for r a w independently on each 
night is a better approximation than blindly assuming a 
global value. 

Importantly, the matrix solution to this problem gives 
not only the MAP values, it also gives the covariance 
matrix of the parameters in the fit, which is an exact 
representation of the shape of P(M\D,pe,Pt), which is 
a multidimensional Gaussian. The diagonal elements of 
this covariance matrix give the uncertainty in each pa- 
rameter marginalized over all the other linear parame- 
ters. Because we have constructed our model in such a 
way that the parameters that most strongly influence es- 
timates of the transit depth D are linear, we can use this 
analytical solution as a robust estimator the shape of the 
Marginalized Probability of a Lone Eclipse. It gives us 
both the maximum a posteriori transit depth D and the 
Gaussian width of the distribution CMarPLE- 

4.2.4. Are the Priors Really Priors? 

As the priors we use to regularize our model fits are 
themselves derived from MEarth data, one might object 
that the division between the likelihood and the prior is 
set somewhat arbitrarily. We include data only from a 
single night in the likelihood and group all the informa- 
tion from the rest of the nights into the prior. Indeed, 
we could have instead organized the entire season of data 
into the likelihood and left the priors uninformative. The 
division is arbitrary, but useful. 

The advantages of treating nights other than that on 
which a candidate transit falls as external to likelihood 
are two-fold. First, it is more computationally efficient: 
instead of recalculating the likelihood of an entire sea- 
son's data when investigating individual events, we only 
need to calculate the likelihood over the relevant night's 
data points. The information provided by the entire sea- 
son changes little from candidate transit to candidate 
transit; thus it is best to store that information as a pre- 
computed prior. 

Second, this organization scheme allows the flexibility 
for individual nights to behave differently. For exam- 
ple, consider the pixel position E xi (t) and E y i (t) terms 
in the systematics model, which capture the influence of 
stars wandering across the detector. As the detector flat- 
field can change from night to night, it would be foolish 
to try to fit an entire season's light curve with one set 
of coefficients for E X} i(t) and E y ^{t); allowing those co- 
efficients to vary from night to night, within a tightly 
constrained prior, is a more useful approach. Further- 
more, dividing the weight of the likelihood and priors as 
we do provides a helpful degree of outlier resistance, by 
not forcing the model on any one night to account for 
strange behavior on one weird night from months before. 

4.2.5. MarPLE in Practice 
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We calculate D and OMarPLE on a grid of single tran- 
sit epochs pe and durations pr- We construct this grid 
for all nights with usable MEarth data. The epochs in 
this grid are evenly spaced by Ape = 10 minutes, thus 
subsampling the typical MEarth observational cadence. 
The durations are evenly spaced from 0.02 to 0.1 days, 
spanning the likely durations for the orbital periods to 
which MEarth has substantial sensitivity. We extend the 
grid of pe before the first and after last observation of 
each night by half the maximum transit duration, thus 
probing partial transits. 

We demonstrate this calculation graphically in Fig. |4j 
First, we show one night of a typical MEarth light curve. 
To give a sense of a baseline systematics and variabil- 
ity model, we show it the ±ltr span of model light curves 
arising from a fit that contains no eclipses. Next, we show 
a visualization of the MarPLE, the probability distribu- 
tion of hypothetical eclipse depths P(D\pe,Pt)- For any 
chosen value of eclipse epoch and duration, the MarPLE 
is Gaussian-shaped; error bars in Fig. [4] show its central 
±lcr width over the entire grid of pe and px- The width 
of OMarPLE can be seen to decrease for longer durations 
Pt, as more data points are included in each transit win- 
dow. For epochs and durations with no in-transit points, 
the transit depth is unconstrained and OMarPLE — > oo. 
The transit-like dip in photometry at the end of the night 
does not register as significant anywhere in the MarPLE. 
The dip can be explained by MEarth's precipitable water 
vapor systematic (see fi 
ing to the same night). 



cipr 

vapor systematic (see first panel of Figure [l] correspond 



At the bottom of Fig. [4] we also include a subset of 
the variability and systematics parameters, the "nuisance 
parameters" over which we marginalize. We show error 
bars representing the Gaussian widths of both the prior 
-P(M), established from the entire season of data, and 
the results of a fit to this one night of data, P(M|D). For 



I'night, the nightly out-of-transit baseline level parameter, 
one night's data are more influential than the relatively 
weak prior, so the fit is notably offset from and tighter 
than the prior. In contrast, for the remaining nuisance 
parameters, the influence of one night's data is very weak, 
so the fit essentially reverts to the input priors. 

In this example, only 10 data points are contributing 
to the likelihood. As the model contains almost as many 
parameters, one might be concerned that we are "over- 
fitting" the data. The bottom of Fig. [I] provide an initial 
step to allay this concern, emphasizing that except for 
the transit depth, each parameter in the fit has its asso- 
ciated prior that provides its own independent constraint 
on the parameter. In a pseudo least squares formalism, 
the presence of these informative priors act as (pseudo) 
data points, ensuring there are always more "data" than 
parameters. 

Furthermore and perhaps more importantly, we could 
indeed be in severe danger of over-fitting if we were in- 
terested in the exact values of the cleaned residuals from 
some single estimate of a best-fit systematics and vari- 
ability model, but we are safe because we care instead 
about the marginalized probability of only one particular 
parameter (the transit depth). Marginalization ignores 
irrelevant information, so we can include a n arbitrary 



numb er of nuisance parameters in the fit (see Hogg et al. 
2010 for discussion). If (and only if) the inferred tran- 



sit depth at any particular pe and pt happens to be 



strongly covariant with one of these nuisance parame- 
ters, then CTMarPLE will include a contribution from that 
parameter. Without the priors degeneracies could po- 
tentially inflate OMarPLE to oo, but with the informative 
season-long priors the nuisance parameters can only vary 
within the range shown in Fig. [4j limiting the degree to 
which they can in turn contribute to OMarPLE- In the 
extreme example, if we had a single data point on a 
night, the cleaned residuals might easily be identically 
zero (i.e. "over-fit") but the MarPLE would accurately 
express what the night told us about the presence or ab- 
sence of transits. 

Estimating OMarPLE across the whole grid of pe for an 
entire season can be performed very quickly. Each grid 
point requires only several SVD's of a matrix whose di- 
mension is the sum of the number of data points within 
the night and the number of linear parameters being fit. 
For a MEarth light curve containing 1000 points and 
spanning 100 days, the whole grid of calculations requires 
several seconds on a typical desktop workstation. 

4.2.6. Ad Hoc Red Noise Correction 

The likelihood in Eq. [9] assumed that adjacent light 
curve data points were statistically independent. If our 
method fails to completely correct for systematics or stel- 
lar variability, this assumption will be violated. Time- 
correlated noise slows the y/~N improvement that would 
be gained by obtaining N independent Gaussian mea- 
surements. So, if we were to ignore the temporal corre- 
lations between data points, we could substantially bias 
our estimates of tTMarPLE- 

Specifically, correlated noise would cause us to overes- 
timate the significance of transits that spanned multiple 
data points. We demonstrate this phenomenon in Fig. [5j 
which shows the MarPLE results for two simulated light 
curves (generated from the real time stamps of a typical 
MEarth target) with different levels of correlated noise. 
One light curve consists of pure white Gaussian noise. 
The other consists of the white light curve averaged with 
a smoothed version itself, scaled so both light curves have 
an identical RMS, roughly approximating a finite red 
noise contribution. We then inject common mode and 
meridian flip trends are injected into both light curves. 
The toy- model simulation of time-correlated noise is very 
coarse but is only meant to serve an illustrative purpose. 

Because each estimate of D is drawn from a Gaus- 
sian distribution with a width CMarPLE: the quan- 
tity D I OMarPLE should ideally be Gaussian-distributed 
around with a variance of 1, except when real tran- 
sits are present. For the light curve with pure white 
noise, this is true for all transit durations in Fig. [5] (see 
the histograms at right). For the light curve with signifi- 
cant correlated noise, we underestimate OMarPLE and the 
distribution of -D/oMarPLE appears broadened for some 
durations. The effect is most pronounced at longer du- 
rations, where more data points fall within each tran- 
sit. For the shorter durations, typically only one or two 
light curve points fall within a transit so the red noise 
does not substantially affect our estimate of CMarPLE- 
This general behavior, of overestimating the significance 
of longer duration transits, is common among MEarth 
targets whose light curves show features that are poorly 
matched by the input model. 
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Figure 5. A demonstration of the effect of red noise on the inferred significance of transits, showing the signal-to- noise ratio of hypothetical 
transits with all possible epochs (along the x-axis) and durations (denoted by color). D, CMarPLEi an d colors are the same as in FigureR] 
Results are shown for fake light curves generated from the time-stamps of a real MEar th tar get assuming either white Gaussian noise (top) 
or correlated red noise (bottom), before applying the red noise correction described in |4.2.6| Histograms (right) indicate that uncorrelated 
white noise leads to D/aMnrPLE following a unit- variance Gaussian distribution (black curves) for all durations, whereas red noise in 
the light curve broadens the distribution, especially for long duration transits. If uncorrected, this would cause us to overestimate the 
significance of candidate transits. 



To account for the problem, we posit that each light 
curve has some additional red noise source that can be 
expressed as a fixed fraction of the white noise, defining 
r ajr as the ratio of red noise to white noise in a light 
curve. With this parameterization, the transit depth 
uncertainty associated with a transit that contains AT tra 
data points becomes 



CMarPLE = OjMarPLE, 



1 



(16) 



where UMarPLE,™ is the estimate of OMarPLE that ac- 
counted only for white noise. To determine its optimum 
value, we scale r a ^ r until the distribution of D / OMarPLE 
has a MAD of 1/1.48 (i.e. the distribution has a Gaus- 
sian width of unity). This correction is similar to the 



V(n) formalism described by Pont et al. (2006). Hence 



forth, when we use the term CMarPLE, we are referring to 
its red-noise corrected value. 
A more ideal solution would account for_ time- 
but 
LE's 



correlated noise directly in the likelihood (Eq. KM 
doing so would substantially decrease MISS Mar : 
computational efficiency. As such, we settle on Eq. [16] 
as a useful ad hoc solution. Fig. [6] shows the amp]> 
tude of r CTjr for all stars in the MEarth survey, indicating 
that most stars have low red noise contributions, after 
accounting for our stellar variability and systematics. 

4.3. Phasing Multiple MarPLE's Together 

MISS MarPLE, as just described, investigates the sig- 
nificance of a single transit event. The method can be 
straightforwardly extended to search for periodic transit 
candidates as well. Once D and UMarPLE have been cal- 
culated over a grid of pe and Pt, characterizing periodic 
candidates is simply a matter of combining all precom- 
puted lone eclipses from this grid that match the appro- 
priate period pp and starting epoch peo- In Kepler par- 
lance, this is the step where Single Event S tatistics are 
combined i nto Multiple Event Statistics (see |Tenenbaum 
eFaL][2012). 

Given pp and peo, we identify those values of pe that 
fall within 5 minutes of this linear ephemeris and that 
have finite values of UMarPLE- Each lone eclipse carries 
its own Gaussian distribution in D. Multiplying these 



independent Gaussians together leads to the standard 
inverse- variance weighted average: 



D 



phased 



^phased 



J2 ^/ a MarPLE,i 
S V ^MarPLE, i 



(17) 



(18) 



pi.;-.. 



where the sums are performed over the A^poch epochs 
that were observed for a given candidate period and 

Starting epoch. If Xphasod = E(A ~ phased) 2 /^MarPLEJ 

is greater than A epoc h, we take it as an indication that 
the uncertainties would have to be underestimated if 
that candidate ephemeris were real. In this case, we 
rescale ceased 2 up by a factor of X 2 hasod /N cpoch . In other 
words, we enforce that the independently measured tran- 
sit depths that go into each phased candidate must agree 
to within their errors. 

Because we evaluate the MarPLE on grid of epochs 
that is super-sampled with respect to both our observa- 
tional cadence (20 minutes) and typical transit durations 
(0.5 to 2 hours), adjacent values of pe will have highly 
correlated transit depth estimates. This simply reflects 
that a (complicated) binning over the transit duration 
Pt has already gone into these estimates. Whereas the 
above sums would be over all in-transit light curve points 
in a traditional BLS, with MISS MarPLE we include only 
one term in the sum for each independent event. 

To perform a full search, we repeat this procedure on 
a grid of periods. Because we hope to identify planets 
with potentially very few events, it is absolutely crucial 
that we explore a fine enough grid in periods that we 
not miss any peaks in the probability distribution. We 
set App so that when moving from one period to the 
next, the first and last data points of a season move by 5 
minutes with respect to each other in phase (leading to 
exponentially spaced candidate periods). MEarth target 
star mass and radius estimates are reliable t o 30-35% (or 
better for those sta rs with parallaxes, see |Nutzman fc| 
Charbonneau||2008[); we use this information to search 
only up to the transit duration of a planet in a circular 
orbit with impact parameter for each period. 
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Figure 6. Histogram of the red noise rescaling factor rv jr (see Eq. \16\ in each of the four MEarth seasons of observations, estimated 
on different transit duration timescales. We experimented with a narrower filter in the 20f0-20f 1 season in the hopes of alleviating our 
precipitable water vapor systematic; we found its long wavelength cutoff to be sensitive to humidity and temperature, exacerbating the 
problem and resulting in increased red noise for this year. 



This search is the most computationally intensive step 
in the process. Searching a typical MEarth season re- 
quires roughly 10 5 candidate periods and 10 minutes 
on a desktop workstation, using vectorized IDL code. 
Searching multiple seasons requires up to 10 6 periods, 
thus needing correspondingly more time. 

A brief side note: cleaning methods like TFA or EPD 
can be run in a "r econstructive mode ," in which they fit 
away systematics (| Ko yacs et al.|2005 ) and/or variability 
( Kovacs & Bakos 2008 ) towards a known signal present in 
the data. Such reconstructive techniques have generally 
not been applied when running period searches for plan- 
ets, because the computational cost of rerunning them 
for all possible transit periods, durations, and epochs is 
untenable. When calculating P(D\pe,Pt) with MISS 
MarPLE, we are performing an analysis that is in some- 
ways similar to a reconstructive TFA/EPD (i.e. fitting 
systematics and variability in the presence of a candidate 
transit). But in the case of MISS MarPLE, we first per- 
form this analysis on individual transits using data from 
individual nights, and phase up the results to candidate 
periods second. Thus we postpone the combinatorics of 
the period search until after the costly m atrix in vers ions. 

The form of the weighted sums in Eq. [17] and [18] high- 
lights an important feature. Events with few observa- 
tions in a night, events that fall on nights with poor 
weather, events that correlate with the star's position 
on the detector, events at high airmass, events on nights 
where a star is acting weirdly - namely, bad events - will 
have large CMarPLE's and be naturally down- weighted 
in the sum. In contrast, good events falling on well- 
sampled, well-behaved nights will get the credit they de- 
serve, exactly as we want. The advantages extend even 
further, in that this sum can span beyond a single tele- 
scope or a single season, enabling the straightforward 
combination of data from multiple sources with multi- 
ple systematics and even at multiple wavelengths into a 
coherent whole. 

5. RESULTS 

We apply MISS MarPLE to real MEarth light curves 
for which we have at least 100 observations in a season, 
and discuss two aspects of the results here. First, we 
investigate the properties of simulated transits injected 
into MEarth light curves, in order to provide concrete ex- 
amples and compare MISS MarPLE with other methods. 
Second, we show that the method behaves well when ap- 
plied to the ensemble of real MEarth light curves and 
does not generate an overabundance of false positives. 



Throughout this section, we occasionally point to 
-D/cMarPLE = 3 as a characteristic value of interest. 
MEarth light curves span typically a few thousand in- 
dependent transit durations, so we expect to find several 
3<7 events by chance in each. However, a single candidate 
event identified at -D/oMarPLE > 3 significance would 
be sufficient to set off MEarth's real-time trigger, which 
would immediately gather new observations to confirm 
or deny the event. Triggered observations could poten- 
tially magnify the significance of the single transit until 
the chance of it being a false alarm is low: a single tran- 
sit at 5cr should formally be expected by chance about 
once per 3.5 x 10 6 independent epochs tested, roughly 
comparable to the number of epochs probed across all 
the stars in the MEarth survey to date. These thresh- 
olds for single events are much lower than that required 
to eliminate false positives from a phased search f or p e- 
riodic candidates, which as we discuss in Section |5.2| is 
closer to 7 or 8a. 

5.1. Injected Transits 

To show how known transits appear through the lens 
of MISS MarPLE, we inject simulated transits into each 
of our raw light curves. Then we apply MISS MarPLE, 
and compare the significance of the recovered signals to 
those we injected. For the simulations, we inject 50,000 
fake 2-4R0 planets into each MEarth target star, with 
periods from 0.5 to 20 days, random phases, and im- 
pact parameters between and 1. The transits are limb- 
darkened, us ing quadratic coefficients for an M4 dwarf 



( |Claret|2004l ). 

We characterize each simulation by an "injected S/N": 
the injected transit depth Anjected = (Rp/R*) 2 divided 
by cr injcctGd . We calculate erinjected by a (J2 1/ct 2 )" 1 / 2 es- 
timator, using data points between 2 nd and 3 rd contact of 
the injected transit, with a global rescaling to match the 
RMS of the star's MAP-cleaned light curve. In the con- 
text of other transit detection algorithms that pair BLS 
with a pre-search cleaning step, this Anjected /^injected 
has an important meaning. It would be the detection 
significance BLS would recover for a transit candidate if 
the pre-search data cleaning perfectly removed variabil- 
ity without influencing the depth of any transit events. 
Under the assumptions of this idealized BLS, the quan- 
tity -D iniected/omiected is directly linked (see Burke et al. 
2006) to th e "signal residue" d etection statistic m the 
BEIT paper ( |Kovacs et aT]|2002 ). 

5.1.1. Individual Examples 
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Injected a 2.5R ffi , P = 9.89 day, b = 0.1 planet at 

recovered it as a 2.6R ffi candidate at 9.2a with MISS MarPLE 
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Figure 7. An end-to-end demonstration of MISS MarPLE applied to simulated transits injected into a real MEarth light curve. For this 
candidate, we show D / o~ MarPLE ■> or the marginalized S/N, for all possible transit epochs and durations (top), both as an ordered timeseries 
(left) and as histograms at fixed duration (right). We also show MEarth photometry (bottom, filled circles, with grayscale proportional to 
1/tT 2 ) centered on the individual transit events (l st -3 rd columns), phased to the injected planetary period (4 th column), linearly arranged 
in time (5 th column), and linearly arranged in observation number (6 th column, with nightly gaps denoted). Light curves are shown for 
basic MEarth photometry (I s * row), after subtracting the systematics model to show stellar variability (2 nd row), and after subtracting 
all aspects of the model except for planetary transits (3 rd row) , along with samples from the probability distribution from our light curve 
model in each panel (blue swaths). Points in-transit are marked throughout this figure. 



njected a 2.5R ffl , P = 9.33 day, b = 0.5 planet at "9.5ct" into a 0.20R o star with 50% red noise; 
recovered it as a 2.3R,n candidate at 5.1ct with MISS MarPLE 
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Figure 8. Another demonstration as in Figure [?1 but for a more challenging star. In this case, a high residual red noise fraction and 
strong covariance between the systematics/ variability model and the transit depth limit the recovery significance of this injected candidate. 
In a phased search, this candidate would not stand out as strong. 



1G 



Berta et al. 



We present a couple of illustrative simulations, to give 
a sense of how MISS MarPLE works. In each case, we 
use fake planets with three observed transits and periods 
near ten days. While such long periods would realisti- 
cally offer this many transits only rarely, we use these 
hand-picked examples as a convenient way to show both 
what individual transits of 10 day periods planets look 
like, and what phasing these transits into periodic candi- 
dates looks like. For simplicity's sake, we left MEarth's 
real-time trigger out of these simulations, showing what 
individual transits and phased candidates look like in 
low-cadence data. In reality, most of the injected transits 
above 3ct would have been detected by the real-time trig- 
ger, and their egresses' populated with additional high- 
cadence observations. 

Figure [7] shows one example, a radius planet 

with a P = 9.89 day period and b — 0.1 impact parame- 
ter injected into the raw MEarth light curve of a O.21R 
star. In this case, the 8. Oct injected S/N of the transit 
is well recovered by MISS MarPLE at 9.2ct, as is the 
inferred planet radius. Three transits fell during times 
of MEarth observations; they are marked in the plot of 
-D/CTMarPLE, the eclipse S/N. This star exhibits 0% resid- 
ual red noise and the transits all fall within well sampled 
nights; it is thanks to these favorable conditions that the 
injected and recovered S/N's are so similar. 

For contrast, Figure [3 shows another example with a 
different star but broadly similar planetary parameters. 
Here, the recovered signal's 5.1ct significance is consider- 
ably lower than its injected 9.5ct strength. One reason for 
the difference is that the timescale of the intrinsic stellar 
variability of this star is short enough that the inferred 
transit depths are substantially correlated with it, thus 
making a larger contribution to CTMarPLE- Additionally, 
our model does not completely remove all the structured 
features in this light curve so it exhibits a large red noise 
fraction (r CT , r = 0.5), further suppressing the detection 
significance. 

We also show in Figures [7] and M the photometry from 
MEarth, before and after using the MAP values of our 
model parameters to subtract off systematics and stel- 
lar variability from the light curves. To emphasize that 
the result of MISS MarPLE is not simply one best-fit 
model of the systematics and variability, but rather an 
inferred probability distribution, we plot the swaths of 
light curve space that are spanned at ±1ct by this distri- 
bution of models. We note that the probability distribu- 
tion P(M.\pp,pe ,pt), is conditional on transit period, 
epoch and duration, so when we visualize the models 
with the light curves, we have fixed these parameters 
to t heir best values (as found in the grid search in Sec- 
Because the transit search is entangled with 



< 25% red noise 
25-50% red noise 
> 50% red noise 



based on 5x10 
fake 2-4R~ planets 
injected into 
each star 
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tion 

the cleaning process, the models and appearance of the 
MAP-cleaned light curve would be different for different 
choices of pp, peo, and px- 

5.1.2. Relationship to BLS 

By itself, a search with BLS will give the significance 
of a candidate transit that is conditional on the assump- 
tion that the out-of-transit baseline flux is constant and 
that its noise properties are globally known. If preceded 
by a light curve cleaning step, the transit significance is 
also conditional on the assumption that the aspects of 
the cleaning are correct. An important question is how 
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Figure 9. A comparison of the significance achieved in a phased 
search with MISS MarPLE (^phascd/^phascd) vs - an idealized BLS 
injected /^injected)* ^phased and (Tpj lase( j represent the phase- 
folded combination of in-transit MarPLE's, as in Eq. |17| and |18| 
The definition of f injected /^injected is such that it represents a 
hypothetical in which any pre-BLS cleaning proceeded perfectly 
and without influencing the injected transit depth (see text). Each 
MEarth target star is represented once in this plot by the median of 
4 X 10 4 simulations of planets with random periods, phases, impact 
parameters, and radii. The average significance ratio for each group 
of residual red noise factors r,y >r is shown (dashed lines); as most 
transits in these simulations contain only 1-2 points the impact of 
the red noise is relatively muted. 



much the marginalized significance of candidate transits 
found with MISS MarPLE differs from this conditional 
significance. Generally, the answer to this question will 
depend on the time sampling of the observations; for a 
very well-sampled and well-behaved light curve, the BLS 
and MarPLE results should converge to the same answer. 
But for the case of the real MEarth data, with its large 
gaps and fickle systematics, we approach this question 
with simulations. 

Figure [9] shows the results of a head-to- head compari- 
son of the significance with which MISS MarPLE views 
phased (multiple-event) candidates with the significance 
that would go into a BLS calculation, based on ensem- 
ble of injected transits. A full period search was not run 
as part of these simulations; we calculated the detection 
statistics in both cases assuming the period was known. 
This is in line with our goal with MEarth, that we wish 
merely to identify whether a signal of a given significance 
is present, not accurately determine the period of that 
signal from the existing data. 

For MEarth's best behaved stars (with r a , r < 
0.25), the marginalized significance estimated by MISS 
MarPLE is typically 80% of that estimated by our ide- 
alized BLS. For these stars, properly accounting for all 
of the uncertainties in the cleaning process gets us to 
within 20% of the significance we could achieve in the 
unrealistic hypothetical that there were no uncertainties 
in the cleaning process. That MISS MarPLE tends to 
be more conservative than other methods is very impor- 
tant to our ultimate goal of using candidates identified 
by MISS MarPLE to invest limited period-finding follow- 
up observations. The 20% factor suppression of transit 
significance is comparable to the degree to which global 
filtering methods such as TF A suppress estimate d tran- 
sit depths (e.g. HATNet, see|Bakos et al.||2012[). How- 
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Figure 10. The results of "time-machine" simulations, in which we inject single transits into MEarth light curves and attempt to recover 
them, using only data up to and including the transit. As a function of how many nights the target was observed before the candidate 
transit (and thus how tight the priors can be), we show the smallest planet that could be identified at > 3rr in a single event (top) and 
the ratio of the recovered (ffMarPLE) to injected (<r in j ecte( £) transit depth uncertainties (bottom). Each panel shows results from 50,000 
injected transits in each of 100 random stars, with error bars representing the 25% and 75% quartiles of the distribution. We show the 
best (left) and worst (right) halves of the MEarth sample, based on how their average white noise rescaling parameter compares to the 
median of the sample (r a , w = 1.24). The lower envelope of each distribution typically corresponds to transits that fall in the middle of 
well-sampled nights; it converges as soon as tight priors can be established for the systematics coefficients. The upper envelope corresponds 
more to transits at the starts of nights or in poorly sampled nights; it converges more slowly, depending strongly on the priors for both the 
systematics and the variability coefficients. 



ever, the advantage of MISS MarPLE is more than sim- 
ply knowing how much light curve cleaning suppresses 
transit significance on average; it is knowing what the 
cleaning's relative influence is on individual events and 
which events are more, or less, reliable. MISS MarPLE 
can give good events on good nights appropriately higher 
weight, unlike more global methods. 

Figure [9] also shows that the penalty imposed by the 
red noise correction for those stars with r a ^ r > 0.25 is sig- 
nificant but not always debilitating. Because MEarth's 
cadence is so low that typically only 1-2 points fall within 
any given transit window, the influence of red noise on 
most transits is relatively small. However, in cases where 
the cadence is much higher, such as a triggered event ob- 
served in real-time with MEarth, the red noise penalty 
could be much steeper. Also, as Mnjectcd/ci n j G ctcd is the 
best we could hope to achieve for each candidate, it is 
an important check that very few stars show significance 
ratios > 1. 

5.1.3. Evolution of Priors 

As more nights of observations are gathered, the pri- 
ors on the systematics and variability parameters associ- 
ated with a particular star will tighten. As these priors 
tighten, the significance with which a given transit can be 
detected will improve. W e demonstrate this phenomenon 
graphically in Figure 10 which shows how erjyiarPLE for 
single events evolves as more observations are gathered 
as well as the impact of this evolution on the planet de- 
tection. 

We injected transits as before but calculated the 
MarPLE for every individual event using only the data 
up to and including the event, excluding all data af- 
ter 3 rd contact. These "time-machine" simulations are 
an approximation to the information available to the 
MEarth real-time trigger system when deciding whether 
to gather high-cadence followup of a candidate transit. 
We show the results for stars in the best and worst halves 
of the MEarth sample, as judged by how their white noise 
rescaling factors r aiW compare to the median of the sam- 
ple TVuJ = 1.24 . Note that transits have a distribution of 



injected transit depth uncertainties (ci n j ec t e d), based on 
the number of points in transit and the points' relative 
predicted uncertainties ff th c (£) . 

We highlight in Figure HO] the smallest planet that 
could be detected at 3er confidence in a single low-cadence 
event, and how this quantity evolves a function of the 
number of nights a star is observed before the event. 
Imagine a light curve contains 99 eve nt-l ess nights and 
one event on the 100th night; Figure [10] indicates how 
much the information in the event-less nights improved 
the reliability of the single event's detection. In each 
panel, we show the 25 and 75% quartiles of the distri- 
bution (spanning both multiple stars and multiple ran- 
dom transits). For the stars with low r a ^ w , initially only 
planets larger than 2.5-3.8R,0 exhibit deep enough tran- 
sits to be detectable. But as more nights of observa- 
tions tighten the priors, 2.0-2.6R® planets become de- 
tectable, approaching the injected distribution. Stars 
with high r a ^ w behave very differently, presumably be- 
cause our model captures fewer of the features present in 
the light curves. For these stars, the minimum detectable 
planet sizes initially span 3.O-4.4R,0 and never converge 
to the injected values. 



We also show in Figure 10 the distribution of the ratio 
CTMarPLE/cinjected for the simulated transits. The ratio 
starts off well in excess of unity, but approaches it as more 
prior-establishing observations are gathered. The range 
of values it spans corresponds to transits falling at more 
or less opportune moments. Values of CTMarPLE/cinjected 
closer to 1 are usually associated with transits that fall in 
the middle of a well-behaved night. Higher values corre- 
spond to events that fall at the start of a night, events in 
a night with high excess scatter, or events that coincide 
with transit-like features in the systematics or variabil- 
ity models. By the end of a season, the distribution of 
OMarPLE/o-injcctcd for single events in Figure [To] roughly 
approaches that for phased candidates in Figure [9] This 
makes sense, as the phased S/N ratios in Figure [9] use 
priors established from all the nights. 

5.2. Application to MEarth Data 
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Figure 11. The per-point RMS photometric uncertainty as predicted from a CCD noise model (open circles) and that ultimately achieved 
in MarPLE-cleaned photometry, after subtracting off MAP models for systematics and stellar variability (filled circles). In each case, flares 
and in-transit points for each star's best candidate have been excluded from the calculation of the post-cleaning achieved RMS. Note that 
the improvement in the RMS relative to Figure|3]is achieved without blind suppression of planetary transits, as the MISS MarPLE cleaning 
occurs in tandem with the search for transits. 



Finally, we employ MISS MarPLE to analyze all the 
MEarth target stars with no transits injected into them. 
Figure [TT] gives one summary of the method's effective- 
ness. Here, we plot the achieved RMS in MEarth light 
curves after using MAP models of the systematics and 
variability to clean the light curves. Comparison to 
Figure [3] shows a dramatic improvement, moving the 
achievea RMS for all the stars much closer to their the- 
oretical minima. However, the achieved RMS values still 
lie on a locus with a slight upward offset, indicating that 
our cleaning does not quite reach the photon noise limit. 
Indeed, this is a reflection of our finding that the median 
white noise rescaling parameter is r„ :W = 1.24. Figure [XT] 
also shows no evidence that we are over-fitting, in that 
we never ac hiev e an RMS lower than predicted. 

In Figure 12 we show the period and detection sig- 
nificance of the best phased candidate that we identify 
for each MEarth star using our MarPLE-based search. 
Here we have searched only one season of photometry at 
a time, so the same star may appear in multiple panels 
if we had multi-year observations of it. MEarth has pub- 
lished two systems with planet-size d eclipses: the planet 
GJ1214b ( iCharbonneau et al.|2009 ) and the brown-dwarf 
NLTT41135 fllrwin et al.|20l0j ). The latter system is in a 
visual binary that was unresolved in the MEarth discov- 
ery data, so its eclipse depth was diluted to a planet-like 
2% depth. While these sys tems were discovered b y using 
an iterative median-fi lter dAigrain fc Ir win | |2004[ ) paired 
with traditional BLS ( |Kovacs et al.|2U(J2p , we recalculate 
their detection significances using MISS MarPLE and in- 
dicate them in Figure [12] We also indicate the long- 



period low-mass eclipsingbinary LSPM Jl 112+7626 (Ir- 



win et al. 2011b), which was detected at very high sig- 



nificance (30(7 J with the real-time detection trigger. Not 



shown is the sho rt-period eclipsing binary GJ 3236 (Ir 
win et al.|2009a ) , as it was identified by eye in MEartlTs 
commissioning data before the 2008-2009 season. 

Several new candidates were initially identified above 
8a significance in Figure [12] but upon inspection the sig- 
nals were found to be associated with bad raw images. 
The candidates evaporated after we removed these bad 
images from consideration. While we are actively in- 



vestigating the most promising remaining candidates in 
Figure |12[ none are as convincing as were our original 
confirmed systems in their discovery data. 

The morphology of the plots in Figure [12] is roughly 
what we expect. Due to geometry, most of our stars 
will not host exoplanets that transit. Initially, one might 
think then that the cloud of candidates hovering around 
5 — 6cr must mean we are substantially overestimating the 
significance for all of our stars. However, we must con- 
sider what makes a reasonable detection threshol d for a 
phased p l anet s earch. As discussed in detail by |Jenk 



ins et al. (20021, each phased search for planets consti- 



tutes an enormous num ber of effective hypot heses being 
tested against the data. |Jenkins et al. ( 2002 ) found that 
a phased search of a Kepler light curve, with continuous 
cadence and a 4-year baseline, corresponded to an esti- 
mated number of equivalent independent tests (-/Veit) of 
Neit = 1.7 x 10 7 . That is, the detection statistic ex- 
pected from searching a transit-free Kepler light curve 
would be the same as asking for the maximum value 
achieved in 1.7 x 10 7 draws from a unit-variance Gaus- 
sian; the median null detection statistic should be above 
5cr. It is this consideration that leads to the 7.1(7 detec- 
tion threshold for the nominal Kepler mission. 

Although the relationship is complicated, generally 
A^eit increases with the number of observations gath- 
ered, the number of periods, and the number of inde- 
pendent phases searched. Because 1-hour transits of M 
dwarfs are much shorter than the 10-hour transits typi- 
cal for Kepler, we search many more phases for any given 
period. While the gap-filled, single-season MEarth light 
curves going into Figure 12 have very different proper- 
ties than Kepler's, an estimate of A^eit on the or der of 
10 7 is still a decent estimate. Indeed, using the Jenk 



ins et al. (2002) bootstrap simulation method, we esti- 
mated for a MEarth light curve with 10 3 data points in 
which we searched 10 5 periods that Aeit ~ 5 x 10 6 . Null 
detection statistics above 5a should be a regular occur- 
rence in phased searches of MEarth targets. The posi- 
tion of MEarth's confirmed targets in Figure 12 suggests 
a 7 — 8(7 threshold is probably appropriate for MEarth. 
Thresholds could safely be much lower for detecting sin- 
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gle events, closer to 5a, without the brutal combinatorics 
of a phased search. 

Although it is too computationally intensive to calcu- 
late N-eit for the different observational coverages repre- 
sented by all of the MEarth targets, we try graphically to 
demonstrate the effect of N-eit in Figure [12] We fill the 
symbols with an intensity proportional to the number of 
independent eclipse epochs (j>e) that the light curve cov- 
ers, using this as a very rough proxy for N-eit- This color- 
ing scheme yields a vertical color gradient in all panels, 
reflecting the fact that targets with more observations 
have generally higher -/Veit and are more likely to gener- 
ate high null detection statistics by chance. 

Time-correlated noise can also disturb the frequency 
stability of the phased search, if correlations exist 
over timescales comparable to planetary periods being 
searched. For example, some uncorrected effect with a 
1 day -1 frequency could build up over subsequent nights 
into what might look like a period ic planet signal. Our 
ad hoc correction in Section 14.2.61 does not account for 
that aspect of time correlated noise. It is likely that such 
extra uncorrected trends in the 2010-2011 season (which 
exhibited excess correlated noise, see Tabl e [T ) leads to 
excess of 1 day period candidates in Figure 

6. FUTURE DIRECTIONS 

MISS MarPLE could be applied to other ground-based 
surveys for transiting exoplanets. Its advantages will be 
greatest for other pointed surveys like MEarth, where 
individual observations of individual stars are costly 
enough that it is worth the effort of optimally charac- 
terizing the information that each contributes. Aspects 
of MISS MarPLE be potentially useful to other surveys 
specifically targeting M dwarfs, such as PTF/M-dwarfs, 
APACHE, or RoPACS, where the variability and/or sys- 
tematics are similar to those we described here. 

Additionally, ground-based photometric followup to 
find t ransits of radial velocity planets (e.g. Kane et al. 
2009 1 faces similar challenges. Typically looking tor shal- 
low transits in light curves of bright stars, such efforts 
require careful consideration of the systematic uncertain- 
ties associated with candidate events. For example, the 
RV-detected super-Earth HD97658b, initially announced 
to transit from ground-based photoelectric phot ometry 
at its pred icted time and with 5.7a confidence (Henry 



correlations am ong data points (see, for example, [Carter 
fc Winn||2 009[) 
ligh 



et al. 



2011), was fo und not to transit in f ollowup space 



based photometry (Dragomir et al. 2012). This contra- 
diction led to a reevaluation of the systematics in the 
ground-based observations, which were taken at high air- 
mass. As the most exciting planet discoveries will often 
be those made very close to the detection threshold, it 
is important to accurately assess the uncertainties as- 
sociated with the measured depths of putative transits. 
Some aspects of a method like the one we proposed here 
could be useful to marginalize over systematic uncertain- 
ties and thus give more confidence in the significance of 
transit detections in future followup efforts. 

Many improvements could be made on our current im- 
plementation of MISS MarPLE. For one, the Gaussian 
likelihood we use to describe our data (Eq. [91 is an ap- 
proximation. It is decent, but it could be elaborated by 
including a mixture of probability distributi ons for each 
data point (to account for j unk outliers; e.g. Hogg et al. 
2010[ |Sivia fc Skilling||2006[ ) or by directly modeling the 



Also, the variability aspect of our gener 
ight curve model is extremely simplistic (Eq. ffl. 

nightly offset model 



ative 

By replacing our crude sinusoid 
with a more sophisticated basis, one might be able to bet 
ter capture all the variability features in real light curves, 
thus minimizing the uncertainty its correction injects into 
the marginalized probability of lone eclipses. In particu- 
lar, a variability m odel based on Gaussian processes (see 
Gibson et al. 2011 for an introduction) may be a promis- 
mg route for setting dynamically evolving priors for the 
astrophysical behavior of a star on any given night. 

In an upcoming paper, we intend to apply the MISS 
MarPLE framework to the task of estimating MEarth's 
sensitivity to 2-4R ffi planets over the last four years. 
Given our single planet detection of GJ1214b, we will 
use this survey sensitivity estimate to place limits on the 
occurrence rate of short-period planets around nearby 
mid-to-late M dwarfs. Such limits would be complemen- 



2012) 



tary to results both from Kepler ( |Howard et al 
and from the HAR PS M dwarf radial velocity program 
( |Bonfils et aLpOlI ) 



Finally, our ultimate goal with MISS MarPLE is to 
identify promising candidates with MEarth and make 
follow-up observations to determine their periods. With 
this new well-tested method, we plan to focus our ef- 
forts in this direction in the years to come. Determining 
how to schedule the most useful observations for period- 
finding is a difficult task, but the "adaptive sch edul- 
ing" algorithm proposed by Dzigan & Zucker ( 2011[ ) may 
prove a very fruitful route. 

7. CONCLUSIONS 

In this work, we have proposed a new method for 
detecting planetary transits in wiggly, gap-filled light 
curves. A method such as this is necessary to eke the 
optimal sensitivity out of the MEarth Project, our sur- 
vey for transiting 2-4R ffi exoplanets around nearby mid- 
to-late M dwarfs. MEarth's unique observing strategy 
gives rise to new challenges (for example, FigureJT]), thus 
inspiring our efforts to improve on existing transit detec- 
tion techniques. 

One idea lies at the core of our new method: that 
when assessing the significance of any individual plane- 
tary transit, we want to marginalize over all the uncer- 
tainties, including those associated with cleaning system- 
atics and intrinsic variability from the star's light curve. 
Our Method for Including Starspots and Systematics in 
the Marginalized Probability of a Lone Eclipse (MISS 
MarPLE) can investigate transits within the context of 
individual nights of observations (see Figure [4]), sensibly 
accounting for various kinds of trends, occasionally messy 
observational cadences, and the vagaries of photometric 
conditions common to ground-based observatories. MISS 
MarPLE uses an analytic, semi-Bayesian approach to in- 
clude information from an entire season of observations 
as priors to constrain the expected behavior of a star on 
any given night. 

We applied MISS MarPLE to four seasons of MEarth 
photometry, showing that it improves our sensitivity to 
transiting exoplanets (Figures [3] and 11). By inject- 



ing simulated transitingplanets into real MEarth light 
curves (Figures [7] and pi, we compare MISS MarPLE 
to the popular Box-fitting Least Squares (BLS) method 
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Figure 12. A summary of the application of MISS MarPLE to four individual seasons of MEarth data. Each point represents the best 
periodic candidate identified from a phased search of one star. As each phased transit search effectively performs many effective independent 
tests on the data, the position of the dense locus of candidates between 5 and 6cr is broadly consistent with the null hypothesis, of most of 
our stars not exhibiting planetary transits. For each star, the intensity of the symbol's color fill is proportional to the number of lone-eclipse 
epochs (pe) for which observations exist. We also show the detection significance of published MEarth systems, based on the discovery 
data for each. 



(Kovacs et al. 2002) and find that even for the best 
behaved MEarth targets, BLS underestimates the true 
transit depth uncertainties typically by 20% (Figure [9]). 
That is, the covariance of hypothetical transit depths 
with systematics and variability corrections, on average, 
increases the true transit depth uncertainty by 20% for 
MEarth survey data. Simulations also show that 2-3R® 
planets that are undetectable in the first few weeks a tar- 
get is observed become detectable, either in archival data 
or in incoming data, later in the season as the behavior 
of the star is better constrained (Figure 10). 

The "MarPLE," the probability distribution of hypo- 
thetical transit depths for any given transit duration and 
epoch, is a useful concept. Because this probability dis- 
tribution is designed to be marginalized over all the com- 
plicated factors associated with the telescope or the night 
on which the observations were taken, it can be treated 
as a rigorous statistical summary for the presence or ab- 
sence of a transit at any moment. Thus, we can straight- 
forwardly combine these portable MarPLEs estimated 
from different telescopes using different filters at differ- 
ent observatories into coherent planet candidates. By 
properly accounting for so many transit detection uncer- 
tainties, the MarPLE should also save precious followup 
resources by not wasting time on too many false alarms. 
A framework such as MISS MarPLE could be a useful 
tool for any collaborative, global followup of long-period 
transiting exoplanet candidates that may be identified 
by MEarth or other observatories. 

As the search for transiting planets around nearby 
stars pushes to radii smaller than 2R e , properly account- 
ing for systematics and variability will be become ever 
more important. MISS MarPLE may prove to be a valu- 
able asset in the hunt for transiting exoplanets around 
bright M dwarfs in the years to come. 
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